%conclusion
In this paper,
we conducted an extensive evaluation of the performance and energy
tradeoffs among choices in device technologies, 3D integration,
microarchitecture, and scheduling under constraints imposed by
realistic yield models, thermal bounds, and exploitable application
parallelism.
We showed that, with 3D integration, steep-slope
devices can extend the space of viable designs and achieve competitive
performance for highly parallel applications. We highlighted how, with
further technology scaling, the range of applications for which
steep-slope devices are appropriate grows, while the portion of the
design space where CMOS is optimal shrinks to a point where only a
small number of CMOS cores may be desirable. Finally, we presented an
intelligent scheduling approach for hybrid CMOS-TFET systems that
capitalizes on these trends, yielding mean improvements
ranging from 17\% for high-end server workloads running at around 90$^\circ$C to over
160\% for embedded systems running below 60$^\circ$C.
